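We propose a new graph neural network (GNN) module based on relaxations of the recently proposed geometric scattering transform, which is composed of graph wavelet filters. Our learnable geometric scattering (LEGS) module enables adaptive tuning of the wavelets to encourage band-pass features to emerge in learned representations. Incorporating the LEGS module in GNNs enables the learning of longer-range graph relations than many popular GNNs, which often rely on smoothness or similarity between neighbors to encode graph structure. Further, its wavelet priors result in simplified architectures with significantly fewer learned parameters than competing GNNs. We demonstrate the predictive performance of LEGS-based networks on graph classification benchmarks, as well as the descriptive quality of their learned features in biochemical graph data exploration tasks. Our results show that LEGS-based networks match or outperform popular GNNs, as well as the original geometric scattering construction, on many datasets, in particular in biochemical domains, while retaining certain mathematical properties of handcrafted (non-learned) geometric scattering.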
translated by 谷歌翻译
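As a rough illustration of the graph wavelet filters mentioned in the geometric scattering abstract above, the sketch below builds fixed dyadic diffusion wavelets from a lazy random walk matrix. The function names, the dyadic scales, and the column-normalized walk are assumptions made for illustration; the abstract does not spell out the construction, and the learnable relaxation used by LEGS is not reproduced here.

```python
import numpy as np

def lazy_random_walk(adj):
    """Return P = (I + A D^{-1}) / 2. Assumes the graph has no isolated nodes."""
    deg = adj.sum(axis=0)
    # Broadcasting divides column j of A by deg[j], which gives A D^{-1}.
    return 0.5 * (np.eye(adj.shape[0]) + adj / deg)

def diffusion_wavelets(adj, num_scales=3):
    """Return [I - P, P - P^2, P^2 - P^4, ...] and the remaining low-pass P^(2^J)."""
    p = lazy_random_walk(adj)
    wavelets = [np.eye(adj.shape[0]) - p]
    p_low = p
    for _ in range(num_scales):
        p_high = p_low @ p_low          # squaring doubles the diffusion time
        wavelets.append(p_low - p_high)  # band-pass: difference of two scales
        p_low = p_high
    return wavelets, p_low

def first_order_scattering(adj, signal, num_scales=3):
    """Sum of |Psi_j x| over nodes, one feature per wavelet scale."""
    wavelets, _ = diffusion_wavelets(adj, num_scales)
    return np.array([np.abs(w @ signal).sum() for w in wavelets])

# Hypothetical example: a path graph on four nodes.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
wavelets, low_pass = diffusion_wavelets(adj)
features = first_order_scattering(adj, np.array([1.0, 0.0, 0.0, -1.0]))
```

The wavelets and the final low-pass operator telescope to the identity, so together they keep all the information in the input signal while separating it by scale.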
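Graph neural networks (GNNs) extend the success of neural networks to graph-structured data by accounting for its intrinsic geometry. While extensive research has been devoted to developing GNN models with superior performance on a collection of graph representation learning benchmarks, it is currently not well understood which aspects of a given model these benchmarks probe. For example, to what extent do they test a model's ability to leverage graph structure versus node features? Here, we develop a principled approach to taxonomize benchmarking datasets according to a $\textit{sensitivity profile}$ that is based on how much GNN performance changes under a collection of graph perturbations. Our data-driven analysis provides a deeper understanding of which benchmarking data characteristics GNNs leverage. Consequently, our taxonomy can aid the selection and development of adequate graph benchmarks and a better-informed evaluation of future GNN methods. Finally, our approach and its implementation in the $\texttt{GTaxoGym}$ package are extendable to multiple graph prediction task types and future datasets.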
We propose a geometric scattering-based graph neural network (GNN) for approximating solutions of the NP-hard maximum clique (MC) problem. We construct a loss function with two terms, one which encourages the network to find highly connected nodes and the other which acts as a surrogate for the constraint that the nodes form a clique. We then use this loss to train an efficient GNN architecture that outputs a vector representing the probability for each node to be part of the MC and apply a rule-based decoder to make our final prediction. The incorporation of the scattering transform alleviates the so-called oversmoothing problem that is often encountered in GNNs and would degrade the performance of our proposed setup. Our empirical results demonstrate that our method outperforms representative GNN baselines in terms of solution accuracy and inference speed as well as conventional solvers like Gurobi with limited time budgets. Furthermore, our scattering model is very parameter efficient with only $\sim$ 0.1\% of the number of parameters compared to previous GNN baseline models.
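A minimal sketch of a two-term loss and a rule-based decoder of the kind described above. The quadratic form, the weight `beta`, the greedy decoding rule, and all function names are assumptions made for illustration; the abstract does not give the exact formulation.

```python
import numpy as np

def clique_loss(p, adj, beta=1.0):
    """Two-term surrogate loss for node probabilities p in [0, 1]^n (assumed form)."""
    n = adj.shape[0]
    comp = 1.0 - adj - np.eye(n)   # adjacency matrix of the complement graph
    connectivity = -(p @ adj @ p)  # rewards selecting mutually connected nodes
    violation = p @ comp @ p       # penalizes selecting pairs that share no edge
    return connectivity + beta * violation

def greedy_decode(p, adj):
    """Visit nodes by decreasing probability; keep a node if it is adjacent to all kept nodes."""
    clique = []
    for v in np.argsort(-p):
        if all(adj[v, u] for u in clique):
            clique.append(int(v))
    return sorted(clique)

# Hypothetical example: a triangle {0, 1, 2} with a pendant node 3 attached to node 2.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
probs = np.array([0.9, 0.8, 0.7, 0.4])
decoded = greedy_decode(probs, adj)
loss_clique = clique_loss(np.array([1.0, 1.0, 1.0, 0.0]), adj)
loss_all = clique_loss(np.ones(4), adj)
```

In this toy graph the indicator vector of the triangle attains a lower loss than selecting every node, because the second term charges for the two missing edges at node 3.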
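Geometric deep learning has made great strides towards generalizing the design of structure-aware neural networks from traditional domains to non-Euclidean ones, giving rise to graph neural networks (GNNs) that can be applied to graph-structured data arising in, e.g., social networks, biochemistry, and materials science. Graph convolutional networks (GCNs) in particular, inspired by their Euclidean counterparts, have been successful in processing graph data by extracting structure-aware features. However, current GNN models are often constrained by various phenomena that limit their expressive power and their ability to generalize to more complex graph datasets. Most models essentially rely on low-pass filtering of graph signals via local averaging operations, which leads to oversmoothing. Moreover, to avoid severe oversmoothing, most popular GCN-style networks tend to be shallow, with narrow receptive fields, which leads to underreaching. Here, we propose a hybrid GNN framework that combines traditional GCN filters with band-pass filters defined via geometric scattering. We further introduce an attention framework that allows the model to locally attend, at the node level, over the combined information from different filters. Our theoretical results establish the complementary benefits of the scattering filters for leveraging structural information from the graph, while our experiments show the benefits of our method on various learning tasks.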
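The oversmoothing effect described above can be seen in a small numerical sketch. It uses plain row-normalized averaging with self-loops rather than the symmetric GCN normalization, and the function names and toy graph are assumptions made for illustration only.

```python
import numpy as np

def local_average(adj):
    """Row-normalized adjacency with self-loops: each node averages itself and its neighbors."""
    a_hat = adj + np.eye(adj.shape[0])
    return a_hat / a_hat.sum(axis=1, keepdims=True)

def spread_after(adj, signal, steps):
    """Max-minus-min of the node signal after `steps` rounds of local averaging."""
    m = local_average(adj)
    x = signal.astype(float)
    for _ in range(steps):
        x = m @ x
    return float(x.max() - x.min())

# Hypothetical example: a path graph on four nodes with a signal that differs at the two ends.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
signal = np.array([1.0, 0.0, 0.0, -1.0])
spreads = [spread_after(adj, signal, k) for k in (0, 1, 5, 200)]
```

The spread shrinks with every round, so after many low-pass layers the nodes become nearly indistinguishable; band-pass filters are one way to retain the differences that averaging removes.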
Segmentation of lidar data is a task that provides rich, point-wise information about the environment of robots or autonomous vehicles. The currently best-performing neural networks for lidar segmentation are fine-tuned to specific datasets. Switching the lidar sensor without retraining on a large set of annotated data from the new sensor creates a domain shift, which causes network performance to drop drastically. In this work, we propose a new method for lidar domain adaptation, in which we use annotated panoptic lidar datasets and recreate the recorded scenes in the structure of a different lidar sensor. We narrow the domain gap to the target data by recreating panoptic data from one domain in another and mixing the generated data with parts of (pseudo-)labeled target domain data. Our method improves the nuScenes to SemanticKITTI unsupervised domain adaptation performance by 15.2 mean Intersection over Union points (mIoU) and by 48.3 mIoU in our semi-supervised approach. We demonstrate a similar improvement for the SemanticKITTI to nuScenes domain adaptation, by 21.8 mIoU and 51.5 mIoU, respectively. We compare our method with two state-of-the-art approaches for semantic lidar segmentation domain adaptation and obtain a significant improvement for both unsupervised and semi-supervised domain adaptation. Furthermore, we successfully apply our proposed method to two entirely unlabeled datasets from two state-of-the-art lidar sensors, Velodyne Alpha Prime and InnovizTwo, and train well-performing semantic segmentation networks for both.
Explainable AI (XAI) is slowly becoming a key component of many AI applications. However, rule-based and modified backpropagation XAI approaches often face challenges when applied to modern model architectures that include innovative layer building blocks, for two reasons. Firstly, the high flexibility of rule-based XAI methods leads to numerous potential parameterizations. Secondly, many XAI methods break the implementation-invariance axiom because they struggle with certain model components, e.g., BatchNorm layers. The latter can be addressed with model canonization, which is the process of restructuring the model to disregard problematic components without changing the underlying function. While model canonization is straightforward for simple architectures (e.g., VGG, ResNet), it can be challenging for more complex and highly interconnected models (e.g., DenseNet). Moreover, there is little quantifiable evidence that model canonization is beneficial for XAI. In this work, we propose canonizations for currently relevant model blocks applicable to popular deep neural network architectures, including VGG, ResNet, EfficientNet, and DenseNets, as well as Relation Networks. We further suggest an XAI evaluation framework with which we quantify and compare the effects of model canonization for various XAI methods in image classification tasks on the Pascal-VOC and ILSVRC2017 datasets, as well as for Visual Question Answering using CLEVR-XAI. Moreover, addressing the former issue outlined above, we demonstrate how our evaluation framework can be applied to perform hyperparameter search for XAI methods to optimize the quality of explanations.
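One common instance of restructuring a model without changing its function is folding an inference-mode BatchNorm layer into the preceding linear layer. The sketch below illustrates that idea under stated assumptions: the function name, the default `eps`, and the random test values are hypothetical, and the abstract does not say that its canonizations take exactly this form.

```python
import numpy as np

def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold y = gamma * (W x + b - mean) / sqrt(var + eps) + beta into one linear layer."""
    scale = gamma / np.sqrt(var + eps)
    folded_weight = weight * scale[:, None]   # scale each output row of W
    folded_bias = (bias - mean) * scale + beta
    return folded_weight, folded_bias

# Hypothetical example with random parameters and fixed running statistics.
rng = np.random.default_rng(0)
weight, bias = rng.normal(size=(3, 5)), rng.normal(size=3)
gamma, beta = rng.normal(size=3), rng.normal(size=3)
mean, var = rng.normal(size=3), rng.uniform(0.5, 2.0, size=3)
x = rng.normal(size=5)

# Linear layer followed by BatchNorm in inference mode.
original = gamma * ((weight @ x + bias) - mean) / np.sqrt(var + 1e-5) + beta

# Single linear layer after folding.
w_fold, b_fold = fold_batchnorm(weight, bias, gamma, beta, mean, var)
canonized = w_fold @ x + b_fold
```

Both computations give the same output, but the folded version no longer contains a separate normalization step that an attribution rule would have to handle.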
Autonomous vehicles currently suffer from a time-inefficient driving style caused by uncertainty about human behavior in traffic interactions. Accurate and reliable prediction models enabling more efficient trajectory planning could make autonomous vehicles more assertive in such interactions. However, the evaluation of such models is commonly oversimplistic, ignoring the asymmetric importance of prediction errors and the heterogeneity of the datasets used for testing. We examine the potential of recasting interactions between vehicles as gap acceptance scenarios and evaluating models in this structured environment. To that end, we develop a framework facilitating the evaluation of any model, by any metric, and in any scenario. We then apply this framework to state-of-the-art prediction models, which all show themselves to be unreliable in the most safety-critical situations.
Geospatial Information Systems are used by researchers and Humanitarian Assistance and Disaster Response (HADR) practitioners to support a wide variety of important applications. However, collaboration between these actors is difficult due to the heterogeneous nature of geospatial data modalities (e.g., multi-spectral images of various resolutions, timeseries, weather data) and diversity of tasks (e.g., regression of human activity indicators or detecting forest fires). In this work, we present a roadmap towards the construction of a general-purpose neural architecture (GPNA) with a geospatial inductive bias, pre-trained on large amounts of unlabelled earth observation data in a self-supervised manner. We envision how such a model may facilitate cooperation between members of the community. We show preliminary results on the first step of the roadmap, where we instantiate an architecture that can process a wide variety of geospatial data modalities and demonstrate that it can achieve competitive performance with domain-specific architectures on tasks relating to the U.N.'s Sustainable Development Goals.
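The physics of a closed quantum mechanical system is governed by its Hamiltonian. However, in most practical situations this Hamiltonian is not precisely known, and ultimately all that is available is data obtained from measurements on the system. In this work, we introduce an approach to learning families of interacting many-body Hamiltonians from dynamical data by bringing together gradient-based optimization techniques from machine learning with tensor networks. Our approach is highly practical, experimentally friendly, and intrinsically scalable, allowing for system sizes of over 100 spins. In particular, we demonstrate on synthetic data that the algorithm works even when restricted to one simple initial state, a small number of single-qubit observables, and time evolution up to relatively short times. For the concrete example of the one-dimensional Heisenberg model, our algorithm exhibits an error that is constant in the system size and scales as the inverse square root of the dataset size.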
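Neural networks trained with stochastic gradient descent (SGD) starting from different random initializations typically find functionally very similar solutions, raising the question of whether there are meaningful differences between different SGD solutions. Entezari et al. recently conjectured that, despite different initializations, the solutions found by SGD lie in the same loss valley after taking into account the permutation invariance of neural networks. Concretely, they hypothesize that any two solutions found by SGD can be permuted such that the linear interpolation between their parameters forms a path without significant increases in loss. Here, we use a simple but powerful algorithm to find such permutations, which allows us to obtain direct empirical evidence that the hypothesis holds in fully connected networks. Strikingly, we find that two networks already lie in the same loss valley at initialization, and that averaging their random but suitably permuted initializations performs significantly above chance. In contrast, for convolutional architectures, our evidence suggests that the hypothesis does not hold. Especially in the large learning rate regime, SGD seems to discover diverse modes.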
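The two ingredients of the hypothesis, permutation invariance and linear interpolation between parameters, can be sketched for a one-hidden-layer network as follows. The network shape and all function names are assumptions made for illustration, and the sketch does not implement the algorithm for finding a good permutation, which the abstract does not describe.

```python
import numpy as np

def mlp(x, w1, b1, w2, b2):
    """One-hidden-layer ReLU network."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2

def permute_hidden(params, perm):
    """Reorder hidden units; the network function is unchanged."""
    w1, b1, w2, b2 = params
    return w1[:, perm], b1[perm], w2[perm, :], b2

def interpolate(params_a, params_b, t):
    """Point at fraction t on the straight line between two parameter sets."""
    return tuple((1 - t) * a + t * b for a, b in zip(params_a, params_b))

# Hypothetical example: random parameters and a random permutation of six hidden units.
rng = np.random.default_rng(0)
params = (rng.normal(size=(4, 6)), rng.normal(size=6),
          rng.normal(size=(6, 2)), rng.normal(size=2))
perm = rng.permutation(6)
x = rng.normal(size=(3, 4))

permuted = permute_hidden(params, perm)
midpoint = interpolate(params, permuted, 0.5)
midpoint_output = mlp(x, *midpoint)
```

The permuted network computes exactly the same function as the original, yet the midpoint between the two parameter sets is generally a different function, which is why the choice of permutation matters when comparing or averaging networks.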